An Evaluation Resource for Geographic Information Retrieval

Authors

  • Thomas Mandl
  • Fredric C. Gey
  • Giorgio Maria Di Nunzio
  • Nicola Ferro
  • Mark Sanderson
  • Diana Santos
  • Christa Womser-Hacker
Abstract

In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic information retrieval requires an evaluation resource which represents realistic information needs and which is geographically challenging. Some experimental results and analysis are reported.

1. Geographic Information Retrieval Evaluation

The Cross Language Evaluation Forum¹ (CLEF) is a large European evaluation initiative dedicated to cross-language retrieval for European languages [Peters et al. 2004]. CLEF was established in response to the rising need for cross- and multi-lingual retrieval research and applications. CLEF provides a multi-lingual testbed for retrieval experiments. The evaluation campaign of CLEF comprises several components: the evaluation methodology, the evaluation software packages, the data collections, the topics, the overall results of the participants, the assessed results of the participants, and the calculated statistical results.

GeoCLEF² was the first track at an evaluation campaign dedicated to evaluating geographic information retrieval (GIR) systems. The aim of GeoCLEF is to provide the necessary framework for the evaluation of GIR systems for search tasks involving both spatial and multilingual aspects. Participants are offered a TREC-style ad-hoc retrieval task based on newspaper collections. GeoCLEF started as a pilot track in 2005 [Gey et al. 2006] and has been a regular CLEF track since then [Gey et al. 2007, Mandl et al. 2008]. GeoCLEF evaluates the retrieval of documents with an emphasis on geographic information retrieval from text. Geographic search requires the combination of spatial and content-based relevance into one result.
Many research and evaluation issues surrounding geographic mono- and bilingual search have been addressed in GeoCLEF. It is still an open research question how best to combine semantic knowledge on geographic relations with vague document representations [Chaves et al. 2005], as well as how to encode place knowledge in NLP [Santos & Chaves 2006]. The multilingual aspect of geographic retrieval in particular is not trivial [Gey & Carl 2004].

¹ http://www.clef-campaign.org
² http://www.uni-hildesheim.de/geoclef

2. Evaluation Resources

Geographical Information Retrieval (GIR) concerns the retrieval of information involving some kind of spatial awareness. Many documents contain some kind of spatial reference which may be important for IR, for example to retrieve, rank and visualize search results based on a spatial dimension (e.g. "find me news stories about bush fires near Sydney"). Many challenges of geographic IR involve geographical references (geo-references) which systems need to recognize and treat properly. Documents contain geo-references expressed in multiple languages which may or may not be the same as the query language. For example, the city Cape Town (English) is also Kapstadt (German), Cidade do Cabo (Portuguese) and Ciudad del Cabo (Spanish).

For 2007, Portuguese, German and English were available as document and topic languages. There were two Geographic Information Retrieval tasks: monolingual (English to English, German to German and Portuguese to Portuguese) and bilingual (language X to language Y, where X or Y was one of English, German or Portuguese). In the first three editions of GeoCLEF, 75 topics with relevance assessments have been developed. Thus, GeoCLEF has developed a standard evaluation collection which supports long-term research.

Topic creation is a collaborative activity of the three organizing groups, who all use the DIRECT system provided by the University of Padua [Agosti et al. 2007].
DIRECT has been designed to extend the current IR methodology in order to provide an integrated vision of the scientific data involved in an international evaluation campaign. It offers tools to support tasks in different areas, for example the creation of the topics and the management of relevance assessments. A search utility for the collections is provided to facilitate the interactive exploration of potential topics.

Each group first created versions of nine proposed topics in their language, with subsequent translation into English. Topics are meant to express a natural information need which a user of the collection might have. These candidates were subsequently checked for relevant documents in the other collections. In many cases, topics needed to be refined. For example, the topic candidate "honorary doctorate degrees at Scottish universities" was expanded to topic GC53 "scientific research at Scottish universities" due to an initial lack of documents in the German and Portuguese collections. After the translation, all topics were thoroughly checked. An example of a topic in the three languages is shown below:

  10.2452/63-GC
  Water quality along coastlines of the Mediterranean Sea
  Find documents on the water quality at the coast of the Mediterranean Sea
  Relevant documents report on the water quality along the coast and coastlines of the Mediterranean Sea. The coasts must be specified by their names.

  10.2452/63-GC
  Qualidade da água na costa mediterrânica
  Os documentos devem referir a qualidade da água nas praias ou costas do Mediterrâneo.
  As zonas a que se refere essa qualidade têm de figurar no documento.

  10.2452/63-GC
  Wasserqualität an der Küste des Mittelmeers
  Dokumente über die Wasserqualität an Küsten im Mittelmeer
  Relevante Dokumente berichten von der Wasserqualität im Mittelmeer in Zusammenhang mit den Namen der Küsten und Küstenabschnitte, an denen die Verschmutzungen aufgetreten sind.
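The multilingual topic shown above can be held in a small data structure that keeps the language versions of one topic together. The following is only a sketch: the field names (title, description, narrative) follow the usual TREC-style topic layout and are an assumption about how the flattened example divides, as are the language codes and the helper name group_by_identifier. None of this describes the actual GeoCLEF topic file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    # Field names are assumed (TREC-style); the source only shows the raw text.
    identifier: str
    lang: str
    title: str
    description: str
    narrative: str


def group_by_identifier(topics):
    """Return {identifier: {lang: Topic}} so translations sit side by side."""
    grouped = defaultdict(dict)
    for topic in topics:
        grouped[topic.identifier][topic.lang] = topic
    return dict(grouped)


TOPICS = [
    Topic("10.2452/63-GC", "en",
          "Water quality along coastlines of the Mediterranean Sea",
          "Find documents on the water quality at the coast of the Mediterranean Sea",
          "Relevant documents report on the water quality along the coast and "
          "coastlines of the Mediterranean Sea. The coasts must be specified by their names."),
    Topic("10.2452/63-GC", "pt",
          "Qualidade da água na costa mediterrânica",
          "Os documentos devem referir a qualidade da água nas praias ou costas do Mediterrâneo.",
          "As zonas a que se refere essa qualidade têm de figurar no documento."),
    Topic("10.2452/63-GC", "de",
          "Wasserqualität an der Küste des Mittelmeers",
          "Dokumente über die Wasserqualität an Küsten im Mittelmeer",
          "Relevante Dokumente berichten von der Wasserqualität im Mittelmeer in "
          "Zusammenhang mit den Namen der Küsten und Küstenabschnitte, an denen die "
          "Verschmutzungen aufgetreten sind."),
]

GROUPED = group_by_identifier(TOPICS)
```

Grouping by identifier makes it easy to pick the query language for a bilingual run (for example GROUPED["10.2452/63-GC"]["de"] for a German to English experiment).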
The organizers aimed at creating a geographically challenging topic set. This means that explicit geographic knowledge should be necessary for the participants to successfully retrieve relevant documents. The topics should not favor purely keyword-based approaches. While many geographic searches may be well served by keyword approaches, others require profound geographic reasoning. We speculate that, for a realistic topic set where these difficulties might be less common, most systems could perform better. In order to achieve a geographically challenging topic set, several difficulties were explicitly included in the topics of GeoCLEF 2006 and 2007:

  • Ambiguity (a church called St. Paul's Cathedral exists in London and São Paulo)
  • Vague geographic regions (Near East)
  • Geographical relations beyond IN (near Russian cities, along Mediterranean Coast)
  • Cross-lingual issues (Greater Lisbon, Portuguese: Grande Lisboa, German: Großraum Lissabon)
  • Granularity below the country level (French speaking part of Switzerland, Northern Italy)
  • Complex region shapes (along the rivers Danube and Rhine)
  • Differences between local and national newspapers (local events are not often mentioned in national newspapers of other countries)

However, it was often difficult to develop multilingual topics which fulfilled these criteria. For example, local events which allow queries on a level of granularity below the country often do not lead to newspaper articles outside the national press. This makes the development of cross-lingual topics difficult.

The systems use the topics to produce results, which are then merged into a document pool that is evaluated by human assessors. The spatial dimension is an additional factor in this relevance judgment process. Documents need to be relevant and geographically adequate.
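The pooling step described above can be sketched in a few lines. This is a minimal illustration of standard TREC-style pooling, not the GeoCLEF implementation: the pool depth, the run names and the document identifiers are all hypothetical, and the is_relevant helper simply encodes the statement that a document must be both thematically relevant and geographically adequate.

```python
from collections import defaultdict


def build_pool(runs, depth=2):
    """Merge the top `depth` documents of every run into one pool per topic.

    runs: {run_name: {topic_id: [doc_id, ...] ranked best first}}
    Returns {topic_id: sorted list of unique doc ids to be judged}.
    """
    pool = defaultdict(set)
    for ranking_by_topic in runs.values():
        for topic_id, ranked_docs in ranking_by_topic.items():
            pool[topic_id].update(ranked_docs[:depth])
    return {topic_id: sorted(docs) for topic_id, docs in pool.items()}


def is_relevant(thematically_relevant, geographically_adequate):
    """A pooled document counts as relevant only if both conditions hold."""
    return thematically_relevant and geographically_adequate


# Hypothetical runs from two systems for one topic.
RUNS = {
    "run_keyword": {"10.2452/63-GC": ["d1", "d2", "d3"]},
    "run_gazetteer": {"10.2452/63-GC": ["d2", "d4", "d1"]},
}

POOL = build_pool(RUNS, depth=2)
```

With a depth of 2, the pool for the example topic contains d1, d2 and d4; d3 is never judged because no run ranked it high enough.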
The participants used a wide variety of approaches to the GeoCLEF tasks, ranging from basic IR approaches (with no attempts at spatial or geographic reasoning or indexing) to deep natural language processing (NLP) to extract place and topological clues from the texts and queries. Specific techniques used included (for more details see the overview paper [Mandl et al. 2008]):

  • Ad-hoc techniques (weighting, probabilistic retrieval, language model, blind relevance feedback)
  • Semantic analysis (annotation and inference)
  • Geographic knowledge bases (gazetteers, thesauri, ontologies)
  • Text mining
  • Query expansion techniques (e.g. geographic feedback)
  • Geographic Named Entity Extraction
  • Geographic disambiguation
  • Geographic scope and relevance models
  • Geographic relation analysis
  • Geographic entity type analysis
  • Term expansion using WordNet
  • Part-of-speech tagging

The relevance judgments posed several problems, illustrated here in detail for the "free elections in Africa" topic: What is part of an election (or presupposed by it)? In other words, which parts are necessary or sufficient to consider that a text talks about elections: the campaign, direct results, who the winners were, "tomada de posse" (taking office), speeches when receiving power, cabinet constitution, the balance after one month, or after a longer period?

3. GeoCLEF Collection

The document collections for the 2007 GeoCLEF experiments consisted of newspaper and newswire stories from the years 1994 and 1995 used in previous CLEF ad-hoc evaluations. The Portuguese, English and German collections contain stories covering international and national news events, and therefore represent a wide variety of geographical regions and places. The English document collection contains 169,477 documents and is composed of stories from the British newspaper The Glasgow Herald (1995) and the American newspaper The Los Angeles Times (1994).
The German document collection consists of 294,809 documents from the German news magazine Der Spiegel (1994/95), the German newspaper Frankfurter Rundschau (1994) and the Swiss newswire agency Schweizer Depeschen Agentur (SDA, 1994/95). For Portuguese, GeoCLEF 2007 used two newspaper collections spanning 1994-1995, from the Portuguese newspaper Público (106,821 documents) and the Brazilian newspaper Folha de São Paulo (103,913 documents). Both are major daily newspapers in their countries. Not all material published by the two newspapers is included in the collections (mainly for copyright reasons), but every day is represented with documents. The Portuguese collections are also distributed for IR and NLP research by Linguateca as the CHAVE collection³, recently distributed with automatic syntactic annotation as well. The English and German collections are available in a CLEF package from ELDA/ELRA.

  GeoCLEF Year   | Collection Languages                 | Topic Languages
  2005 (pilot)   | English, German                      | English, German
  2006           | English, German, Portuguese, Spanish | English, German, Portuguese, Spanish, Japanese
  2007           | English, German, Portuguese          | English, German, Portuguese, Spanish, Indonesian
  2008 (planned) | English, German, Portuguese          | English, German, Portuguese

Table 1: GeoCLEF collection and topic languages by year.

In all collections, the documents have a common structure: newspaper-specific information like date, page, issue, special filing numbers and usually one or more titles, a byline and the actual text. The document collections were not geographically tagged and contained no semantic location-specific information.

  Language   | Number of documents
  English    | 169,477
  German     | 294,809
  Portuguese | 210,734

Table 2: GeoCLEF 2007 test collection size.

³ http://www.linguateca.pt/CHAVE/

A query classification task has also been conducted.
The challenge for systems was to identify the geographic queries within a real search engine query log and to recognize their geographic and thematic parts [Li et al. 2008]. Training and test data labeled by humans were created as the test environment.
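To make the query classification task concrete, the sketch below splits a query into a thematic part, a spatial relation and a geographic part. It is a toy illustration under strong assumptions, not the method of any participating system: the relation list, the tiny gazetteer and the function name split_query are all hypothetical, and a query only counts as geographic here if the text after a relation word is a known place name.

```python
# Hypothetical spatial relation words; longer phrases are tried first.
RELATIONS = ("near", "in", "along", "north of", "south of")

# Toy gazetteer (lower-cased place names); a real one would be far larger.
GAZETTEER = {"sydney", "cape town", "kapstadt", "mediterranean sea"}


def split_query(query):
    """Return {'what', 'relation', 'where'} for a geographic query, else None."""
    tokens = query.lower().split()
    relations = sorted(RELATIONS, key=len, reverse=True)
    for i in range(len(tokens)):
        for relation in relations:
            rel_tokens = relation.split()
            j = i + len(rel_tokens)
            if tokens[i:j] != rel_tokens:
                continue
            where = " ".join(tokens[j:])
            # Accept the split only if the remainder is a known place.
            if where in GAZETTEER:
                return {
                    "what": " ".join(tokens[:i]),
                    "relation": relation,
                    "where": where,
                }
    return None


EXAMPLE = split_query("bush fires near Sydney")
NON_GEO = split_query("cheap flights")
```

A query such as "cheap flights" yields None because no relation word is followed by a gazetteer entry. Real query logs would additionally need handling for place names without an explicit relation word and for ambiguous names.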




Publication date: 2008